Label Length and Title Type as Determinants in Visitor Learning

Jane Marie Litwak 
University of Minnesota 
Minnesota Historical Society 



Abstract 

This dissertation applied concepts from cognitive psychology to the design of 
museum exhibit labels in an effort to increase learning and memory in museum 
settings. A series of three studies focused on factors affecting whether or not using a 
question (instead of a statement) as a title on a museum exhibit label would increase 
the visitor’s memory of the information presented in the label text. The first study is 
being published separately by Litwak, Bielinski, & Samuels. The other two studies 
form the basis of the dissertation. In all three experiments the labels accompanying 
the bird dioramas at the Bell Museum of Natural History at the University of 
Minnesota were rewritten. Subjects (college students) visited the museum at their 
leisure. At the end of their visit they were surprised with a multiple choice test 
(dependent variable) on the content of the experimental labels (independent 
variable). Evidence was found that questions increased learning. 



Introduction 

Museums are not often optimal learning environments. They can be distracting 
settings, overloaded with information and crowded with visitors seeking to fulfill 
personal and social agendas. Museum professionals lament the fact that many 
visitors spend relatively little time at each exhibit component and usually only stop 
at a small portion of the displays in the museum. Under these conditions it is not 
surprising that the information presented in the exhibits does not find its way into 
the long-term memory of the visitors. To truly learn and remember new material, 
the learner must focus on, elaborate, organize, and rehearse the information. 

Visitors must form personally meaningful associations with the exhibit content and 
link it to their pre-existing schemas. This takes time and energy and often fails to 
occur in a museum setting. 

In an attempt to combat these overwhelming odds, museums employ a variety of little 
psychological tricks to catch and keep visitors’ attention. One of these tricks is to 
pose questions to visitors on the exhibit labels. The philosophy behind this method is 
that once the visitors' curiosity is piqued, they will read the label and learning will
naturally follow (Screven, 1986, 1992; Rand, 1985). Screven and Hirschi (1988) found 
that adding a question to an exhibit label increased time spent at the exhibit from 6.6 
seconds to 95 seconds. However, no study to date has sought to measure the increase 
in learning that results from the addition of the question. 

Studies in classroom settings, by contrast, have been much more focused on how 
questions can increase learning. While classroom teachers and textbooks do 
sometimes use questions to pique curiosity, questions are more often used to guide the 
cognitive processes of the students. While most studies to date have focused on 
teacher-student discourse, headway has been made in understanding the role of 
questions in text. Leonard (1987) found that students who read texts with questions at 
the beginning of each paragraph scored significantly higher on achievement tests
than did the students who read the same text without the questions. Friedman (1981) 
found that inferential questions inserted in the text at the end of paragraphs 
prompted higher achievement than literal questions. 

In applying these classroom findings to a museum setting we need to explore two
questions: “What type of question, when posed on a label, facilitates the most learning
in a museum setting?” and “Under what conditions do we encounter increased
learning as a result of the new labels?” The following three studies were designed to 
explore these questions. 



Study One 

Goals 

The goal of Study One was to compare the teaching efficacy of three different types 
of questions that could be posed on the labels: explicit, implicit, and scriptual. This 
typology looks at how the question is answered in the label text for the visitor. 
Straight-out, word-for-word answers are called explicit. Answers that are alluded to
but must be pieced together by the visitor are called implicit. Questions that ask
about the experiences and opinions of the visitors and thus are not answered in the 
text are called scriptual. Our hypothesis was that the implicit questions would 
produce better recall of the information on the labels because they force the visitor 
to manipulate the information more thoroughly in order to arrive at an answer. 

Methods 

Site and Materials. The experiment took place in the Bird Hall at the Bell Museum of 
Natural History located on the University of Minnesota main campus. The labels 
accompanying the eight largest dioramas in this gallery were rewritten to focus on 
the most prominent species of bird in the display. Each of the experimental labels 
consisted of a title, a simple text presenting 3-4 facts about the chosen species of bird 
in the diorama, and a line drawing of the target bird. The average label was 50 words 
long and was written at a 7th grade reading level. The text and questions were 
created by the author and edited by the curators of the Bell Museum. Labels were 
laser printed in New York font, 24 point for the text and 36 point for the titles, on 8.5”
x 14” white paper and inserted into backlit panels of the same size built in alongside
the dioramas.



The experimental conditions were created by manipulating the type of question 
which appeared at the top of the labels. The text and drawing for each label 
remained the same for each condition. A new round of experimental labels was 
installed each week, the order having been determined by a random drawing. The 
four experimental conditions were: 

Week One: Explicit Question (Answer stated explicitly in the label text) 

Week Two: Implicit Question (Answer implied, not directly stated) 

Week Three: Scriptual Question (Answer not stated or implied) 

Week Four: Statement Title (No question, only the name of the bird) 

For example, the text for the exhibit with the Burrowing Owl was: 

The Burrowing Owl builds its nest underground. These owls could dig 
their own burrows, but they usually use the abandoned den of a prairie 
dog or pocket gopher. When a predator approaches, the Burrowing Owl 
dives into its underground nest. 

The experimental titles were:

Explicit Question: Could a Burrowing Owl dig its own burrow? 

Implicit Question: Are burrows a safe place for Burrowing Owls to hide? 

Scriptual Question: Would you like to live in an underground nest? 

Statement Title: Burrowing Owl 

A 24 item multiple choice quiz was then created to test the subjects’ memory of the 
facts presented on the labels. Each label text contained three facts: one that was the 
answer to the explicit question, one that was the answer to the implicit question, and 
one additional fact. Thus three multiple choice questions could be created for each of 
the eight labels. This quiz was informally pretested on several small samples of 
graduate students and revised until each response foil for each question was 
endorsed by approximately one quarter of the respondents. 

A second instrument was created to learn about the behaviors and affective 
responses of the subjects that might have had an impact on their memory of the 
exhibit labels. These eight items, which were administered before the quiz, asked the 
subjects to rate their opinions or behavior on a five point scale (Not at all ... A
little ... Very much; First ... Middle ... Last; etc.)

1. How much did you enjoy your visit at the Bell Museum today? 

2. How much did you enjoy the large, bird dioramas in particular? 

3. How many of the labels at the bird dioramas did you actually read? 

4. How interesting and entertaining were the labels at the bird dioramas? 

5. How much do you feel you learned from the labels at the bird dioramas? 

6. When during your visit did you view the bird dioramas? 

7. Did you visit the museum today alone, or with a friend? (yes/no) 

8. Had you ever been to the Bell Museum before today? (yes/no) 

Subjects and Procedures. A total of 157 undergraduate and graduate students 
participated in the study. The subjects were assigned randomly to one of the four 
treatment groups that visited the museum during their assigned week or to the 
control group that did not visit the museum. Subjects were told that this experiment 
was about the effect of the ambient environment on visitor enjoyment of a museum 
visit. They were asked to visit the museum any time during the week that they were 
assigned and just “wander around and have a good time”. They could bring a friend 
if they wished, but not a child under the age of 12. Subjects were told to check in and 
out at the front desk. When they checked in, they were told by the cashier to be sure 
to see the bird gallery. When they checked out, the cashier sat them down in an 
office and gave them the behavior & opinion survey and the quiz. Subjects were 
asked not to discuss their experiences with their classmates until the end of the 
quarter. The subjects in the control group were asked not to visit the museum at all 
that quarter. They were held after class one day during Week One and were given 
the quiz on the facts on the labels that the other subjects would take. Three weeks 
after each group visited the museum they took a follow-up test in a classroom setting. 
This was the same 24 item multiple choice quiz that they had taken at the museum. 

The control group also retook the quiz three weeks after their original testing. 

Results 

An initial ANOVA showed significant differences between the treatment groups on 
both the initial quiz (F = 15.26, p = 0.000) and the follow-up quiz (F = 9.26, p = 0.000). A 
set of orthogonal contrasts confirmed that the Control group scored significantly 
lower than the four treatment groups on the initial quiz (t = 7.41, p = 0.000) and the 
follow up quiz (t = 5.57, p = 0.000). The Statement group also scored significantly 
lower than the three question groups (Implicit, Explicit, Scriptual) on the initial quiz
(t = 2.59, p = 0.010) and the follow-up quiz (t = 2.28, p = 0.024). There were no 
significant differences between the means of the Implicit, Explicit, and Scriptual 
groups on either of the quizzes. A separate ANOVA showed that while the mean 
scores of all four treatment groups declined significantly from the initial test to the 
follow-up test, there was not a significant difference in the amount of decline. 



Mean Scores for All Treatment Groups on the Initial and Follow-up Quizzes

Treatment Group    Initial Score    Follow-up Score

Implicit           14.0 (58%)       12.6 (53%)

Explicit           14.3 (60%)       12.4 (52%)

Scriptual          15.1 (63%)       14.5 (60%)

Statement          12.2 (51%)       11.2 (47%)

Control             7.2 (30%)        8.1 (34%)
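
As a check on the table above, each percentage is the group mean divided by the 24 quiz items. The short Python sketch below reproduces that conversion and the initial-to-follow-up decline for each group. The chance-level line assumes four answer choices per item; that is inferred from the pretest criterion (each response option endorsed by about one quarter of respondents) rather than stated outright, and all variable names are illustrative.

```python
# Mean quiz scores reported for Study One (out of 24 items).
N_ITEMS = 24
N_OPTIONS = 4  # assumption: four answer choices per item

initial = {"Implicit": 14.0, "Explicit": 14.3, "Scriptual": 15.1,
           "Statement": 12.2, "Control": 7.2}
follow_up = {"Implicit": 12.6, "Explicit": 12.4, "Scriptual": 14.5,
             "Statement": 11.2, "Control": 8.1}


def percent(score, n_items=N_ITEMS):
    """Convert a raw mean score to percent correct."""
    return 100.0 * score / n_items


# Expected score from blind guessing under the four-option assumption.
chance_score = N_ITEMS / N_OPTIONS

# Raw-score decline from the initial quiz to the follow-up, per group.
decline = {g: round(initial[g] - follow_up[g], 1) for g in initial}

for group in initial:
    print(f"{group:<10} initial {percent(initial[group]):5.1f}%  "
          f"follow-up {percent(follow_up[group]):5.1f}%  "
          f"decline {decline[group]:+.1f}")
print(f"chance level: {chance_score:.1f} of {N_ITEMS} items "
      f"({percent(chance_score):.0f}%)")
```

Under that assumption, guessing alone would yield about 6 of 24 items (25%), a useful baseline when reading the Control group's means of 7.2 and 8.1.
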



Of the behavior and opinion questions only two were significantly correlated to the 
quiz scores. Subjects who reported having read more labels scored higher than those 
who read fewer on the initial quiz (r = .52) and the follow-up (r = .46). Subjects who 
gave higher ratings on how “interesting and entertaining” the labels were scored 
higher on the initial quiz (r = .35) and the follow-up (r = .30) than those who gave 
lower ratings. Given these findings it was concluded that posing a question on a 
label may pique visitors’ interest and prompt them to read the label thus resulting in 
more learning, but further evidence would be needed to show that the question 
guided the learning process. 



Study Two 

Goals 

The goal of this study was to replicate the question vs. statement findings of Study 
One under slightly different conditions: instead of comparing subjects who had read 
only labels beginning with questions to subjects who had read only labels beginning 
with statements, all subjects would be exposed to an exhibit hall in which half of the 
labels began with questions and half began with statements. Study Two also 
controlled for the factor of subject motivation. It was hypothesized that visitors who 
were cued to study the labels would remember the content of both types of labels 
equally while the uncued visitors would have better memory of the information from 
labels that began with questions. Given the results of Study One, no effort was made 
to create different types of questions for the labels. 

Methods 

Site and Materials. The labels at all ten bird dioramas at the Bell Museum of Natural 
History were again rewritten, but this time with more input from “visitors” and 
museum staff. Twelve graduate students visited the museum and listed all the
questions they had about the dioramas; 32 of their classmates then rated the
interest level of these questions on a scale of one to five. Eight of the staff at the Bell
Museum rated the same questions on a scale of one to five on their appropriateness as 
topics for labels for the dioramas. 



The question for each diorama that rated highest in both visitor interest and staff 
approval was chosen for the experiment and developed into a label. The question was 
answered in the first paragraph of the label and the topic was further developed in a 
second paragraph. Half of the labels were then randomly chosen to be “Statement” 
labels and the question title was shortened into a brief descriptive statement. For
example, “Do Wood Ducks live in trees?” became “Wood Duck Homes.” 

A set of “good label criteria,” gleaned from numerous articles on label writing, was
created and followed. For a review of the burgeoning literature on how to write good
labels, see Mackinney (1993). All labels were 85-100 words long and were written at
the 7th or 8th grade level (7.0 - 8.9 on the Flesch Scale). They were printed in the
same manner as in Study One, but did not include illustrations. Twenty-one
undergraduate students rated the interest level of the label drafts and the text was 
revised in response to their comments. 

The 30 item multiple choice quiz for this study consisted of three questions from each 
diorama: one fact from the first paragraph of the label, one fact from the second 
paragraph, and one item based on visual memory of the diorama. An example of a 
visual memory item would be: 

In the exhibit with a family of wood ducks in a forest, the male Wood Duck was 

a. floating in the water. 

b. standing next to the female Wood Duck. 

c. peeking out of the nest.

d. sitting on a tree branch.

A draft of the quiz was pretested on six adults, revised, and then formally tested on 21
undergraduates to ensure that adults who had not seen the labels scored at chance 
level on the quiz and that the response foils were chosen with approximately equal 
frequency. 

The following multiple choice behavior and opinion items also appeared on the quiz: 

1. Did you visit the museum today alone, or with a friend? 

2. Had you ever been to the Bell Museum before today? 

3. How often do you usually visit museums? 

4. How much did you enjoy your visit at the Bell Museum today? 

5. When during your visit did you view the bird dioramas? 

6. How many of the labels at the bird dioramas did you actually read? 

7. How interesting and entertaining were the labels at the bird dioramas? 

8. How much did you know about birds before today? 

Subjects and Procedures. A total of 56 graduate students (from a different program 
than those who had helped create the experimental labels) participated in the main 
portion of the study. Each subject was assigned randomly to one of the two treatment 
groups (cued, uncued) that visited the museum. Twenty one undergraduates were 
used as a comparison group. These students read the label text in a classroom setting 
and then immediately took the quiz. 

The ten experimental labels, half with questions as titles and half with brief 
descriptive statements, were installed and remained in place throughout the data 
collection period. Subjects were asked to stop at the admission desk for instructions 
when they arrived at the museum. Those arriving on even numbered calendar dates 
were given written instructions to study the labels at the bird displays for a quiz that 
they would take. Subjects arriving on odd numbered calendar dates were told to 
“have a good time.” All subjects were quizzed just before exiting the museum. There 
was no follow-up test. 
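
The date-based assignment rule described above can be sketched in a few lines of Python. This only illustrates the even/odd rule; the function name and the example dates are hypothetical and are not taken from the study.

```python
from datetime import date


def instruction_group(visit_date: date) -> str:
    """Return the instruction condition for a visit on the given date.

    Even-numbered calendar dates -> "cued" (told to study the labels for a quiz).
    Odd-numbered calendar dates  -> "uncued" (told to "have a good time").
    """
    return "cued" if visit_date.day % 2 == 0 else "uncued"


# Arbitrary example dates, chosen only to exercise both branches.
for d in (date(2000, 1, 2), date(2000, 1, 3)):
    print(d.isoformat(), instruction_group(d))
```

Because subjects chose their own visit day, a rule like this assigns conditions by arrival date rather than by a random draw per subject.
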

Results



This time the ANOVA showed no significant difference between the learning 
effectiveness of questions and statements (F = 1.17, p = 0.285). Surprisingly, the 
differences between the quiz scores of the cued and uncued subjects were also 
insignificant (F = 1.02, p = 0.317) as was the interaction effect (F =1.26, p = 0.266). 

Scores of Cued and Uncued Subjects on Quiz Items
from Labels with Questions and Labels with Statements

Treatment    Score on Q Items    Score on S Items

Cued         7.3 (49%)           8.1 (54%)

Uncued       7.0 (47%)           7.0 (47%)

However, both cued and uncued subjects scored significantly better on test items 
taken from the first paragraph of the label than on items taken from the second 
paragraph or from the visual display (F = 14.79, p = 0.000). 

Scores of Cued and Uncued Subjects on Quiz Items from the 
First and Second Paragraphs of the Labels and the Visual Displays 

Treatment Par 1 Items Score Par 2 Items Score Visual Items Score 
Cued 5.8 (58%) 4.8 (48%) 4.9 (49%) 

Uncued 5.8 (58%) 4.1 (41%) 4.2 (42%) 

Of the behavior and opinion questions, the same two (number of labels read and interest
level) were again significantly correlated with the quiz scores. Subjects who reported
having read more labels scored higher on the quiz than those who read fewer
labels (r = .39). Subjects who gave higher ratings on how “interesting and
entertaining” the labels were scored higher on the quiz than those who gave lower 
ratings (r = .30). Cued subjects did report reading significantly more labels (“a little 
more than half”) than did uncued subjects (“a little less than half”) (t = 2.74, p = 
0.008), but did not differ in opinion or behavior from the uncued subjects in any 
other way. 

One explanation for the findings of this study is that the labels used in the second 
study were too long (100 words as opposed to 50 words in Study One). This conclusion
is based on three facts: the cued subjects did not outscore the uncued subjects even 
though they reported reading more labels, all subjects scored better on quiz items 
from the first paragraphs of the labels than the second paragraphs, and the subjects 
in Study Two scored lower than the subjects in Study One. 

It is more likely, however, that the above findings were the result of uncooperative 
cued subjects who read only “a little more than half” of the assigned labels and did 
not perform any better on the quiz than the subjects who were not expecting to be 
tested. The study unfortunately coincided with their final exam week, so perhaps the 
subjects were a bit distracted. To probe this hypothesis 21 undergraduates were asked 
to try to “beat the scores of the graduate students.” They read and studied the label 
text for ten minutes in a classroom setting and then immediately took the quiz (minus 
the visual display items). As suspected the comparison group obtained an average 
score of 17.6 out of 20 (88%) on the quiz compared to the 10.6 out of 20 (53%) for the 
cued subjects. 
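
The comparison above rests on simple arithmetic, sketched here in Python. The sketch assumes that the cued group's 10.6 is the sum of its mean scores on first- and second-paragraph items (5.8 and 4.8 in the earlier table); that sum matches the reported figure, but the text does not state the derivation explicitly. Variable names are illustrative.

```python
# The comparison quiz dropped the 10 visual items, leaving 20 text items.
N_TEXT_ITEMS = 20

# Cued group's mean scores on paragraph-one and paragraph-two items.
cued_par1, cued_par2 = 5.8, 4.8
cued_text_total = round(cued_par1 + cued_par2, 1)

# Mean score reported for the classroom comparison group.
comparison_mean = 17.6


def percent(score, n_items):
    """Convert a raw mean score to percent correct."""
    return 100.0 * score / n_items


gap = round(comparison_mean - cued_text_total, 1)

print(f"cued:       {cued_text_total} of {N_TEXT_ITEMS} "
      f"({percent(cued_text_total, N_TEXT_ITEMS):.0f}%)")
print(f"comparison: {comparison_mean} of {N_TEXT_ITEMS} "
      f"({percent(comparison_mean, N_TEXT_ITEMS):.0f}%)")
print(f"gap:        {gap} items")
```
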

Study Three

Goals 

The goal of Study Three was to test for an interaction between label length and title 
type. In a two by two design, long labels (100 words) were installed for one week and 
short labels (50 words) were installed during the second week. Half of each length of 
label began with a question while the other half began with a statement. It was 
hypothesized that subjects who were exposed to the short labels might score better on 
test items from the labels with questions than on items from the statement labels 
while subjects exposed to the long labels would not exhibit this difference (mainly 
because they wouldn’t bother to read the labels). However, given the suspicion that 
the subjects in Study Two had been unmotivated, this was not a strong expectation.

Methods 

Site and Materials. The bird diorama labels used in Study Two were slightly modified 
for use in Study Three as the Long Labels. These labels averaged 94 words in length
(range: 90 - 100) and had an average Flesch Grade Point rating of 7.9 (range: 7.0 -
8.6). Short Labels were created for Study Three by removing information from the
long labels. These labels averaged 49 words in length (range: 48 to 50) and had an
average Flesch Grade Point rating of 8.2 (range: 7.5 - 8.9). The labels focused on the
questions developed in Study Two. Once again, half of the labels were then randomly 
chosen to be “Statement” labels and the question title was shortened into a brief 
descriptive statement on both the short and long versions. 

A 20 item multiple choice quiz was created to test one fact from paragraph (or 
sentence) one and one fact from paragraph (or sentence) two of each label. Twenty 
two undergraduate students and eight adults unrelated to the University pretested the 
quiz and then rated the interest level of the label drafts. The quiz and label copy 
were then adjusted accordingly. Several multiple choice demographic, behavior, and 
opinion items were also included on the quiz: 

1. What is your gender? 

2. Your age: 

3. The highest level of schooling you’ve completed: 

4. Your ethnic background: 

5. Where do you live? 

6. How many of the labels at the bird dioramas on the 2nd floor did you read? 

7. Of the labels you started to read, what portion did you complete? 

8. How interesting were the labels at the bird dioramas that you read? 

9. Did you prefer the labels that began with questions or with short titles? 

10. How long did the labels seem to you? 

11. Please write any other comments that you have about the labels here. 

Subjects and Procedures. A total of 73 graduate students participated in the main 
portion of the study. Each subject was randomly assigned to one of the two treatment 
groups that visited the museum (week one: long labels, week two: short labels). An 
additional 18 students from the same course acted as a control group by taking the
quiz without going to the museum. 

The ten long labels, half with questions as titles and half with statements, were 
installed for one week and then replaced during the second week with the short 
labels, also half with questions as titles and half with statements. Subjects were 
instructed to go to the museum during their assigned week and “spend 45 minutes 
exploring the museum.” All subjects were surprised with the quiz at the end of their
visit. To check that they were indeed surprised, the following item was included on
the quiz during Week Two of the study: 

Before receiving this survey, did you know that you would be tested on the 
information presented on the labels at the bird dioramas? 

a. I had no clue that I would be tested on anything written on any labels. 

b. I thought I might be tested on something, but I did not know what. 

c. I knew I would be tested on the bird labels because (please explain) 

Option “a” was chosen by 28% of the subjects and option “b” was chosen by 72%. 



Results 

This time the subjects scored significantly better on quiz items taken from labels 
with questions than on items from labels with statements (F = 10.42, p = 0.002). 
However, while subjects in both the Long Labels and the Short Labels treatment 
groups scored higher than subjects in the control group (F = 10.91, p = 0.000), the two 
treatment groups did not score differently from one another and there was no
interaction effect between label length and title type. 



Scores of Subjects Exposed to Short Labels and Long Labels
on Quiz Items from Labels with Questions and Labels with Statements

Treatment    Score on Q Items    Score on S Items

Long         4.7 (47%)           3.9 (39%)

Short        4.7 (47%)           4.0 (40%)

Control      2.5 (25%)           1.8 (18%)



Once again, subjects scored significantly better on first paragraph items than on 
second paragraph items if they read long labels and scored better on first sentence 
items than second sentence items if they read short labels (F = 10.91, p = 0.000). 

None of the responses to the demographic, behavior, and opinion items varied 
significantly between subjects who were exposed to long labels and those who
were exposed to short labels. However, five of the variables were significantly 
related to the subjects’ total test scores. Subjects who scored higher on the quiz: 
reported reading more of the experimental labels, reported that they usually read all, 
rather than just part, of each label, found the labels interesting, had noticed that 
some labels began with questions while others didn’t, and lived in Minneapolis (as 
opposed to St. Paul or the suburbs). In response to the question on whether the 
subject preferred labels starting with questions or with statements 44% preferred 
questions, 21% preferred statements, 20% liked both equally, and 15% said they didn’t 
notice. 



Conclusions and Discussion 

This series of studies has provided evidence that questions can be used on exhibit 
labels to increase learning in museums. In both Study One and Study Three visitors 
had better memory of the information presented on labels that began with questions 
than labels that did not. This result was obtained when comparing a condition where 
all the labels in a gallery began with questions to one where all the labels began 
with brief descriptive titles, as well as under the condition where half of the labels in 
a gallery began with questions and half did not. These findings held true for both
longer labels and shorter labels. 

It is the suspicion of the author that questions on museum exhibit labels serve as 
attractors and motivators rather than directors of mental processes. This conclusion 
is based on four pieces of evidence. First, in Study One there was no significant 
difference in the efficacy of the implicit, explicit, and scriptual questions as had been
found in classroom settings. Second, in all three studies the number of labels
reportedly read by the subjects was positively correlated to the subjects’ test score
despite differences in conditions between the three studies. Third, nearly half of the 
subjects in Study Three reported that they preferred labels that began with 
questions. Finally, it just makes sense that a well educated adult should be able to 
remember simple information presented in a clearly written, 50 word passage. This 
suspicion was confirmed by the comparison group in Study Two. 

It may be that the most important factor influencing how much visitors remember 
from the labels they read is how well the labels are written. Labels that are brief,
clear, and entertaining are easy to understand and remember. The question gets the 
visitor to read the label and learning follows naturally. This is not to say that the 
little psychological tricks that museums use to attract the visitors’ attention to the
labels are not important. Attention must be given before learning can occur and 
questions have proven themselves effective in attracting attention. It is also not to 
say that all questions are equal. Future studies can explore other typologies to 
determine which ones are more effective. 